Skip to content

Conversation

@stmh
Copy link
Member

@stmh stmh commented Sep 11, 2025

No description provided.

stmh added 30 commits August 16, 2025 18:36
Add comprehensive OAuth2/OIDC authentication support using oauth2-proxy
and GitLab as identity provider, alongside existing bearer token auth.

Core Changes:
- Implement hybrid authentication system supporting dev/oauth/bearer modes
- Add OAuth2-proxy integration with Traefik reverse proxy setup
- Update frontend to detect and adapt to different authentication modes
- Extend API middleware to handle OAuth headers and session management

New Features:
- Development mode: No authentication for local development
- OAuth mode: GitLab OIDC via oauth2-proxy with cookie-based sessions
- Bearer mode: Traditional API token authentication (existing behavior)
- Complete Docker Compose development environment with profiles
- Frontend auth mode detection with appropriate UI changes

Configuration:
- Add AuthMode enum with dev/oauth/bearer variants in scotty-core
- Extend ApiServer settings with OAuth redirect URL and dev user config
- Update CurrentUser struct to include email, name, and optional access token
- Add oauth2-proxy configuration with GitLab OIDC integration

Development Setup:
- Docker Compose profiles for different authentication modes
- Helper scripts for easy local development and testing
- Complete OAuth development example in examples/oauth2-proxy/
- 1Password integration for secure secret management

Documentation:
- Comprehensive CLAUDE.md with architecture and development guidance
- OAuth testing documentation in LOCAL_OAUTH_TESTING.md
- Configuration examples for all authentication modes

Testing:
- Full OAuth flow validation with Playwright browser testing
- Traefik service discovery and routing verification
- End-to-end authentication testing across all modes

This implementation provides a production-ready OAuth authentication
system while maintaining backward compatibility with existing bearer
token authentication.
BREAKING CHANGE: API endpoints restructured with protected resources moved to /api/v1/authenticated/* namespace

Enable secure GitLab OIDC authentication for Scotty's SPA interface while maintaining CLI bearer token support. This production-ready implementation replaces manual token entry with seamless OAuth flow using oauth2-proxy v7.6.0 and Traefik ForwardAuth middleware.

Key improvements:
- Hybrid authentication modes: dev, oauth, and bearer for different use cases
- GitLab OIDC integration supporting both gitlab.com and self-hosted instances
- Redis-backed session storage eliminating 4KB cookie size limitations
- Clean API separation: public endpoints at /api/v1/ and authenticated at /api/v1/authenticated/*
- Reusable ForwardAuth middleware for protecting additional Docker applications
- Production-ready Docker Compose setup with Traefik reverse proxy

Technical implementation:
- ForwardAuth pattern validates sessions and injects user headers into backend requests
- Breaking change: moved protected API endpoints to dedicated authenticated namespace
- Frontend auto-detects authentication mode and handles OAuth redirects seamlessly
- CLI tools updated to use new authenticated API namespace
- Comprehensive example setups for both development and production OAuth scenarios

This enables self-service user authentication without manual token management while preserving existing CLI workflows and adding enterprise-grade session management.
- Add #[derive(Default)] to AuthMode enum as suggested by clippy
- Add #[allow(dead_code)] to access_token field for future OAuth implementations
- Format all code with cargo fmt

All tests passing and clippy warnings resolved.
- Replace 'any' type with proper LoginResponse interface in auth flow
- Add proper type annotations for API response handling
- Format code to match project style guidelines
The monitorTask function was calling getTask() which only checked the local
store instead of actually fetching task status from the API. Changed to use
requestTaskDetails() to properly fetch task status updates, enabling the
loading spinner animation when running/stopping apps.
- Remove automatic OAuth redirects from login page, require user button click
- Add auth_mode to /api/v1/info endpoint for cleaner auth mode detection
- Use validate-token endpoint for proper authentication checking
- Remove Traefik auto-redirect middlewares that bypass login page
- Consolidate OAuth/bearer redirect handling to both use /login route

Improves user experience by showing login page with explicit OAuth button
instead of automatic redirects that users might miss.
- Add comprehensive OAuth authentication guide with oauth2-proxy and Traefik
- Remove outdated LOCAL_OAUTH_TESTING.md file
- Update configuration.md with new auth_mode options and environment variables
- Add oauth-authentication.md to docs navigation

The new documentation covers the current OAuth implementation including:
- Three authentication modes (dev, oauth, bearer)
- Route-based protection (/api/v1/authenticated/*)
- Working examples in examples/oauth2-proxy-oauth/
- Complete setup instructions and troubleshooting
- Reduce healthcheck interval from 30s to 10s for faster feedback
- Add 15s start period to prevent failures during startup
- Reduce timeout from 3s to 2s (health endpoint responds in ~20ms)
- Add explicit retries=3 for clarity

This improves Docker Compose startup time and provides more responsive
health status feedback during development and deployment.
…tion

- Implement hybrid OAuth approach (Option C) with temporary session exchange
- Add OAuth session storage with 5-minute expiry for security
- Create `/oauth/exchange` API endpoint for frontend token retrieval
- Update OAuth callback to redirect to frontend with session ID
- Separate API and frontend OAuth callback paths to prevent conflicts
- Modify frontend to handle session exchange flow instead of direct tokens
- Add comprehensive OAuth state management and validation
- Ensure no sensitive tokens appear in URLs or browser history
- Support configurable frontend callback URLs via redirect_uri parameter

This approach provides secure OAuth authentication while maintaining
flexibility for different deployment configurations and prevents
common security issues like token exposure in URLs.
…Gravatar support

This comprehensive refactoring transforms the OAuth implementation from GitLab-specific
to fully OIDC-compliant and provider-agnostic, while adding modern UI enhancements.

**OIDC Standards Compliance:**
- Replace GitLab-specific `gitlab_url` with generic `oidc_issuer_url` configuration
- Update `GitLabUser` to `OidcUser` with OIDC standard fields (`sub`, `preferred_username`)
- Use OIDC `/oauth/userinfo` endpoint instead of provider-specific endpoints
- Support optional OIDC claims with intelligent fallbacks

**Provider Interchangeability:**
- Support GitLab, Auth0, Keycloak, Google, and other OIDC providers
- Provider-agnostic configuration and documentation
- Remove all GitLab-specific references from codebase

**Enhanced Authentication UX:**
- Add reactive user store for immediate UI updates after OAuth login
- Implement session exchange flow to avoid exposing tokens in URLs
- Fix user info display appearing only after page reload
- Add proper error handling and debugging for OAuth flows

**Modern Avatar System:**
- Add Gravatar support with MD5 hashing for user avatars
- Implement smart fallback system: Gravatar → initials → generic avatar
- Create reusable UserAvatar component with multiple sizes and shapes
- Enhanced user dropdown with professional styling and logout icon

**Updated Documentation:**
- Comprehensive OIDC authentication guide with provider examples
- Updated CLI documentation with new command structure
- Provider setup instructions for GitLab, Auth0, Keycloak, and Google
- Migration guide from oauth2-proxy to native OIDC implementation

**Frontend Improvements:**
- Reactive authentication state management with Svelte stores
- DaisyUI integration for consistent avatar styling
- Enhanced user interface with Gravatar images and rich user dropdowns
- Improved error handling and user feedback

**Backend Enhancements:**
- Better OIDC user field handling with meaningful fallbacks
- Improved session management and token validation
- Enhanced OAuth error handling and debugging capabilities

The system now works seamlessly with any OIDC-compliant provider while providing
a modern, professional user experience with Gravatar support and reactive UI updates.
Complete end-to-end OAuth device flow implementation enabling CLI authentication
with OIDC providers like GitLab. This adds native device flow support alongside
the existing web-based OAuth flow.

Key improvements:
- Implement full device flow token polling and exchange in Scotty server
- Add proper OIDC provider integration with device authorization grant
- Fix server info endpoint to use OIDC-compliant field names
- Resolve server URL mismatch between localhost and 127.0.0.1 in token storage
- Update scottyctl to handle device flow authentication properly
- Add comprehensive error handling for OAuth flow states

Technical changes:
- Server: Add exchange_device_code_for_token() method for GitLab token polling
- Server: Store device flow session interval for proper polling cadence
- Server: Update info handler to return oidc_issuer_url instead of gitlab_url
- scottyctl: Fix token storage to use user-provided server URL
- scottyctl: Remove placeholder user info and use actual token response data
- scottyctl: Update OAuth structures to be fully OIDC-compliant

The device flow now supports the complete OAuth 2.0 Device Authorization Grant
flow (RFC 8628) with proper error handling for authorization_pending,
access_denied, and expired_token scenarios.

Tested with GitLab OIDC provider - full authentication and API access working.
- Remove unused variables in frontend components
  - Remove imageLoaded variable from user-avatar component
  - Remove oauthRedirectUrl variable from login page
- Apply Prettier formatting across all frontend files
- Fix arrow function formatting in stores and components
- Improve code readability with consistent indentation

These changes resolve ESLint warnings and ensure consistent code style
across the frontend codebase.
Update all documentation to use correct CLI command syntax:
- Change apps subcommand to app:subcommand format
- Change blueprints list to blueprint:list
- Change notifications add/remove to notify:add/notify:remove

This aligns documentation with actual CLI implementation.
Implement complete test suite for bearer token and OAuth authentication flows:

**Bearer Authentication Tests (14 tests)**
- Valid/invalid/missing token authentication scenarios
- Malformed headers and public endpoint access validation
- Configuration-dependent behavior testing
- Cross-authentication mode validation

**OAuth Authentication Tests (8 tests)**
- Complete OAuth device flow: authorization → token → protected endpoint access
- OAuth web flow: authorization URL generation, callback handling, session exchange
- Mock OAuth provider integration with exact API format matching
- OAuth provider error handling and authorization pending states

**Key Technical Achievements**
- Tests actual Scotty application router with complete middleware stack
- Uses AppState access for OAuth session store manipulation to test complete flows
- Mock OAuth provider with wiremock exactly matches implementation request formats
- Validates end-to-end authentication: auth flow → token → protected API access
- All 22 tests validate real implementation behavior, not isolated components

The AppState approach solves OAuth web flow testing complexity by directly
populating session stores, enabling complete flow validation without complex
callback coordination between HTTP requests.

Dependencies: Add axum-test, tokio-test, wiremock for testing infrastructure.
Removed assert!(true) statements that would be optimized out by the compiler,
as flagged by clippy's assertions_on_constants lint. These assertions provided
no actual test value and were replaced with comments explaining the test flow.
- Add preflight version check using semver to ensure major/minor compatibility
- Create shared ServerInfo and OAuthConfig types in scotty_core for consistency
- Add --bypass-version-check flag for emergency situations
- Update AuthMode enum to support serialization with proper defaults
- Version check runs before commands requiring server connection
- Show clear error messages when versions are incompatible
- Skip version check for auth commands (login/logout) and completion

This prevents backwards compatibility issues by ensuring scottyctl and
scotty server versions are compatible before executing operations.
- Replace direct println! calls with app_context.ui() methods
- Use proper success/failed status methods for better terminal integration
- Reduce excessive emoji usage for cleaner, more professional output
- Maintain colored text for important information (URLs, usernames, servers)
- Ensure consistent UI behavior with status line management
This commit centralizes shared functionality in scotty-core and significantly
improves OAuth error handling with type-safe enums.

## New shared modules in scotty-core:
- Add HTTP client with retry logic and exponential backoff
- Add unified OAuth types with type-safe error enums
- Add version management utilities with compatibility checking
- Move retry logic from scottyctl to shared location

## OAuth improvements:
- Replace string literals with OAuthErrorCode enum for type safety
- Add built-in error descriptions to OAuth error codes
- Update all OAuth handlers to use type-safe error responses
- Maintain backward compatibility with legacy error formats

## HTTP client consolidation:
- Create shared HttpClient with builder pattern and timeout support
- Replace scattered reqwest::Client usage with shared implementation
- Add proper error handling and retry policies across the workspace
- Update OAuth flows to use shared HTTP client

## Version management:
- Add comprehensive version comparison and compatibility utilities
- Update preflight checker to use shared version management
- Add user-friendly version formatting and update recommendations
- Include extensive test coverage for version handling

## Code cleanup:
- Remove unused dependencies (utoipa-axum)
- Fix clippy warnings and improve code consistency
- Update imports to use shared types across workspace
- Maintain full backward compatibility

All tests pass, ensuring no regressions were introduced.
This commit consolidates the OAuth error handling across the Scotty project by:

- Unifying OAuthError and OAuthErrorCode into a single comprehensive error type in scotty-core
- Implementing smart IntoResponse for AppError that returns OAuth-compliant ErrorResponse format for OAuth errors
- Adding proper HTTP status code mappings for all OAuth error types (400, 401, 403, 404, 429)
- Fixing device flow "Server error" issue by improving scottyctl error handling to process all OAuth status codes
- Adding SlowDown variant for handling OAuth2 "slow_down" error during device flow polling
- Maintaining OAuth2 RFC 6749 compliance while simplifying the error architecture
- Updating all components (scotty, scottyctl, scotty-core) to use the unified system with proper error conversions

The device flow now properly handles polling rate limiting and provides specific error messages instead of generic "Server error" responses.
Add role-based access control (RBAC) system using Casbin for granular
permission management across apps and groups.

Key features:
- Group-based app organization (development, staging, production, default)
- Role-based permissions (viewer, operator, developer, admin)
- Per-app permission checks (view, manage, shell, logs, create, destroy)
- Universal default group access for all users via wildcard assignment
- Authorization middleware for API endpoints
- Groups list endpoint (/api/v1/authenticated/groups/list)
- Seamless fallback when authorization config unavailable

This enables secure multi-tenant app management while maintaining
backward compatibility.
- Fix token bounds checking in basic_auth.rs to prevent panics with short tokens
- Implement proper get_user_permissions method in authorization service
- Update authorization middleware to extract app names from API v1 paths
- Fix middleware ordering by using State extractor instead of request extensions
- Update router to use from_fn_with_state for require_permission middleware

These changes resolve authentication failures and "App state not found" errors
in the RBAC authorization system.
- Remove references to legacy api.access_token fallback
- Add migration instructions for existing bearer token installations
- Update PRD to reflect completed Phase 4 enforcement
- Document middleware architecture improvements
- Add warnings about breaking changes for bearer token authentication

The authorization system now requires explicit RBAC assignments for all
bearer tokens, removing the legacy fallback behavior.
Updates scottyctl to check server auth mode before using stored OAuth tokens.
When server is in bearer mode, prioritizes --access-token parameter and
SCOTTY_ACCESS_TOKEN environment variable over stored OAuth credentials.

Resolves authentication failures where scottyctl would attempt to use
invalid OAuth tokens against servers configured for bearer authentication.
- Remove fallback authorization service - RBAC is now required
- Update tests to use actual RBAC configuration instead of fallback
- Add test bearer token to policy configuration
- Remove obsolete test for no-token configuration
- Improve log format with timestamp, level, target, and message
- Clean up telemetry configuration and reduce verbose span output
…er compatibility

The ts-generator was using a relative path (../frontend/src/generated) which
resolved to /frontend/src/generated in Docker instead of /app/frontend/src/generated,
causing the frontend-builder stage to fail with "not found" error.

Now uses CARGO_MANIFEST_DIR to build an absolute path that works in both
local development and Docker build contexts.

- Local: /Users/stephan/dev/rust/scotty/frontend/src/generated
- Docker: /app/frontend/src/generated
@claude
Copy link

claude bot commented Oct 26, 2025

Pull Request Review: OAuth2/OIDC Authentication & RBAC Authorization System

This is an impressive and comprehensive implementation of a production-ready authentication and authorization system. The PR adds OAuth2/OIDC support with RBAC using Casbin across 100 files with 30,246 additions.

Overall Assessment

STRENGTHS:

  • Well-architected multi-mode authentication system (dev/oauth/bearer)
  • Comprehensive RBAC implementation using Casbin
  • Strong OAuth2/OIDC compliance (RFC 6749, RFC 8628 device flow)
  • Excellent test coverage (22 tests covering auth flows)
  • Good separation of concerns across modules
  • Complete frontend integration with reactive stores
  • Proper error handling with type-safe OAuth errors

Code Quality & Best Practices

Excellent Patterns

  1. Authentication Architecture (scotty/src/api/auth_core.rs)

    • Clean separation between auth modes
    • Unified authenticate_user_from_token() entry point
    • Proper token format handling
  2. Authorization Service (scotty/src/services/authorization/service.rs)

    • Well-structured Casbin integration
    • Proper use of Arc<RwLock<>> for thread-safe state management
    • Wildcard scope expansion
  3. OAuth Implementation (scotty/src/oauth/)

    • PKCE for web flow (security best practice)
    • Proper session management with expiration
    • RFC-compliant error codes

Security Concerns

HIGH PRIORITY - Session Storage:
In-memory session storage will lose all sessions on restart and doesn't scale across multiple server instances.

Recommendation: Document this limitation prominently in production deployment docs and plan for Redis/database persistence

MEDIUM PRIORITY - Token Logging:
scotty/src/api/auth_core.rs:96-97 logs partial tokens which could aid attackers if logs are compromised.

Recommendation: Log token length or hash instead of partial token values

MEDIUM PRIORITY - Timing Attack:
scotty/src/api/auth_core.rs:173-180 uses non-constant-time string comparison for token validation

Recommendation: Use constant-time comparison from the subtle crate to prevent timing attacks

Test Coverage

Excellent Coverage

Found 7 test files with comprehensive coverage:

  • bearer_auth_tests.rs - 14 tests for bearer token authentication
  • oauth_flow_tests.rs - 8 comprehensive OAuth flow tests with mock provider
  • Tests use real Scotty router with complete middleware stack
  • Edge cases covered (expired tokens, pending authorization)

Test Gaps

Missing test scenarios:

  1. Concurrent permission checks under load
  2. Session expiration and cleanup
  3. Wildcard scope expansion edge cases
  4. Multiple role assignment permission merging
  5. Authorization config hot-reload

Performance Considerations

Good Practices

  • Uses CachedEnforcer for permission checks
  • Arc for concurrent reads
  • Proper async/await throughout

Potential Bottlenecks

  1. Permission checks on every API request - consider caching user permissions in request extensions
  2. Write locks during assign_user_role block all permission checks
  3. Mutex for sessions could cause contention under high load - consider using DashMap

Documentation

Excellent Documentation

  • Comprehensive OAuth guide (docs/content/oauth-authentication.md)
  • Authorization system docs (docs/content/authorization.md)
  • PRD with architecture details (docs/prds/authorization-system.md)

Documentation Needs

  1. Migration guide for existing deployments
  2. Security hardening production deployment checklist
  3. Troubleshooting guide for common auth/authz issues
  4. Performance tuning documentation

Recommendations Summary

Must Fix Before Merge

  1. Add constant-time token comparison (timing attack mitigation)
  2. Reduce token logging (don't log partial tokens)
  3. Add production deployment warnings for in-memory sessions

Should Fix Soon

  1. Document session storage limitations
  2. Add missing test scenarios (concurrent access, expiration)
  3. Add configuration validation
  4. Add migration guide

Nice to Have

  1. Optimize token prefix handling (use Cow)
  2. Consider DashMap for session stores
  3. Add more OIDC provider examples
  4. Performance tuning documentation

Conclusion

This PR represents excellent engineering work with:

  • Production-ready OAuth2/OIDC implementation
  • Comprehensive RBAC using industry-standard Casbin
  • Good test coverage and documentation
  • Clean architecture with proper separation of concerns

The identified issues are mostly minor security hardening and operational concerns that can be addressed incrementally. The core implementation is solid.

Recommendation: APPROVE with minor security fixes

The authentication system is well-designed and the code quality is high. Address the timing attack and token logging concerns, and this will be production-ready.

Great work!


Generated by Claude Code PR Review

Adds a SvelteKit path alias ($generated) to resolve imports from the
generated TypeScript types directory. This fixes Docker build issues
where Rollup couldn't resolve the relative path "../generated".

Changes:
- Add $generated alias in svelte.config.js pointing to src/generated
- Update imports in containerLogsStore.ts to use $generated
- Update imports in taskOutputStore.ts to use $generated

This approach:
- Works consistently in both local development and Docker builds
- Follows SvelteKit conventions (similar to $lib)
- Provides explicit, maintainable imports
@claude
Copy link

claude bot commented Oct 26, 2025

Pull Request Review: OAuth Authentication & RBAC Authorization System

Overview

This is a comprehensive and well-architected implementation of OAuth/OIDC authentication and Casbin-based RBAC authorization for Scotty. The PR adds 30,249 lines across 245 files, implementing a production-ready authentication system with excellent separation of concerns.

🎯 Major Changes

Authentication System

  • Three authentication modes: dev, oauth, and bearer
  • OIDC-compliant OAuth implementation supporting GitLab, Auth0, Keycloak, Google, etc.
  • Device flow (CLI-friendly) and web authorization flow (browser-friendly)
  • Security best practices: PKCE, CSRF protection, session expiration

Authorization System

  • Apache Casbin integration with cached enforcer
  • Scope-based RBAC (not just per-app granularity)
  • Flexible permission model: view, manage, shell, logs, create, destroy, admin
  • YAML-based configuration with dynamic policy updates

Frontend & CLI

  • Reactive authentication state management
  • Gravatar support with smart fallbacks
  • OAuth device flow in scottyctl
  • Version compatibility checking between CLI and server

✅ Strengths

Security Implementation

  1. PKCE Protection - SHA256 challenge/verifier for authorization code flow (RFC 7636 compliant)
  2. CSRF Protection - Random tokens validated in OAuth callback state parameter
  3. Session Expiration - Appropriate TTLs (5-10 minutes) for OAuth sessions
  4. Bearer Token Format - Uses identifier:tokenname to prevent OAuth/bearer collisions
  5. Fail-Safe Authorization - Denies access when RBAC not properly configured
  6. Comprehensive RBAC - Wildcard scopes, multiple roles per user, permission caching

Code Quality

  1. Excellent separation of concerns - Clean module structure in scotty/src/oauth/ and scotty/src/services/authorization/
  2. Type-safe error handling - OAuthError enum with RFC 6749 compliance
  3. Shared code in scotty-core - HTTP client, version management, OAuth types
  4. Comprehensive documentation - New docs for OAuth and authorization configuration
  5. Good test coverage - Authentication, authorization, and OAuth flow tests

Architecture

  1. Provider-agnostic - Generic OIDC implementation, not GitLab-specific
  2. Backward compatible - Maintains existing bearer token authentication
  3. Flexible middleware chain - Composable auth/authz middleware
  4. Well-documented configuration - Clear examples in examples/native-oauth/

🚨 Critical Issues

1. In-Memory Session Storage (BLOCKER for Production)

Severity: CRITICAL
Impact: Cannot run in multi-instance/load-balanced deployments

Issue: OAuth sessions stored in HashMap within single-instance memory:

pub type DeviceFlowStore = Arc<Mutex<HashMap<String, DeviceFlowSession>>>;
pub type WebFlowStore = Arc<Mutex<HashMap<String, WebFlowSession>>>;
pub type OAuthSessionStore = Arc<Mutex<HashMap<String, OAuthSession>>>;

Consequences:

  • Sessions lost on restart
  • No horizontal scaling support
  • OAuth callbacks may hit different server instances (load balancer) causing session not found errors

Recommendation:

  • Implement Redis-backed session storage for production deployments
  • Add configuration option to choose between in-memory (dev) and Redis (prod)
  • Document single-instance limitation in README

Files affected:

  • scotty/src/oauth/mod.rs:38,51,62
  • scotty/src/oauth/handlers.rs

⚠️ Medium Priority Issues

2. No JWT Signature Verification

Severity: MEDIUM
Impact: Relies on OAuth provider availability for every token validation

Issue: Validates OIDC tokens via HTTP call to provider's /userinfo endpoint instead of offline JWT verification:

let user_url = format!("{}/oauth/userinfo", self.oidc_issuer_url);
let response = self.http_client.get(&user_url).bearer_auth(access_token).send().await?;

Recommendation:

  • Add JWT signature verification using OIDC provider's public keys (JWKS endpoint)
  • Implement offline validation to reduce latency and provider dependency
  • Keep /userinfo validation as fallback

Files affected: scotty/src/oauth/device_flow.rs:221-256


3. No Session Garbage Collection

Severity: MEDIUM
Impact: Memory leak with abandoned authentication attempts

Issue: Expired sessions remain in memory indefinitely. Only explicit cleanup occurs on successful exchange.

Recommendation:

  • Implement periodic cleanup task (e.g., every 5 minutes)
  • Remove sessions where created_at + ttl < now()
  • Add metrics for session store size

4. State Parameter Encoding

Severity: LOW-MEDIUM
Impact: OAuth state structure exposed in URLs

Issue: State parameter concatenates session ID and CSRF token as plain string: state = "session_id:csrf_token"

Recommendation:

  • Base64 encode the entire state parameter before sending
  • Decode on callback before parsing

Files affected: scotty/src/oauth/handlers.rs:182


5. Error Information Disclosure

Severity: LOW-MEDIUM
Impact: Detailed errors may leak internal information

Issue: OAuth validation failures return detailed error messages:

error!("OIDC token validation failed: {} - {}", status, error_text);
return Err(OAuthError::Http(...));

Recommendation:

  • Log detailed errors server-side
  • Return generic error messages to clients
  • Use error codes instead of detailed descriptions

Files affected: scotty/src/oauth/device_flow.rs:233-241


📋 Testing Recommendations

Current Coverage

✅ Bearer authentication tests (14 tests)
✅ OAuth flow tests (8 tests)
✅ Authorization service tests (~150 lines)
✅ RBAC permission checks

Gaps to Address

  • Endpoint-level authorization testing - Test all protected routes with different permission levels
  • Permission escalation scenarios - Verify users cannot exceed granted permissions
  • Scope isolation testing - Ensure users in scope A cannot access apps in scope B
  • State parameter validation edge cases - Malformed state, missing CSRF, etc.
  • OAuth error scenarios - Provider errors, network failures, timeout handling
  • Session expiration tests - Verify expired sessions are rejected
  • Multi-user concurrent access - Race conditions in session stores

🏗️ Code Quality & Best Practices

Excellent

  • ✅ Follows Rust idioms and conventions
  • ✅ Proper error handling with thiserror
  • ✅ Good use of type safety (enum for AuthMode, Permissions, OAuthError)
  • ✅ Async/await throughout with tokio
  • ✅ Dependency injection via AppState
  • ✅ Comprehensive logging with tracing

Minor Improvements

  • Clippy compliance: Appears clean based on commit history showing clippy fixes
  • Formatting: Code is formatted with rustfmt
  • Documentation: Add doc comments to public API functions
  • Dependency versions: All dependencies use workspace versions (good practice)

🔒 Security Checklist

Item Status Notes
PKCE implementation ✅ PASS SHA256 challenge/response
CSRF protection ✅ PASS Random tokens validated
Session expiration ✅ PASS 5-10 minute TTLs
Token storage ⚠️ CONCERN In-memory only
HTTPS enforcement ⚠️ VERIFY Should document requirement
JWT validation ⚠️ MISSING Only userinfo endpoint
Rate limiting ❌ MISSING OAuth endpoints vulnerable to abuse
Token rotation ⚠️ MISSING No refresh token rotation
Audit logging ⚠️ PARTIAL Has logging, not structured audit log
Secret management ✅ PASS Uses secrecy crate

📦 Deployment Considerations

Pre-Production Checklist

  • Implement Redis session storage or document single-instance limitation
  • Add session cleanup mechanism
  • Add rate limiting on OAuth endpoints (especially device flow polling)
  • Document TLS/HTTPS requirement for OAuth callbacks
  • Add health check for OIDC provider connectivity
  • Test multi-instance deployment with load balancer
  • Document RBAC configuration migration for existing deployments
  • Verify OAuth provider setup (redirect URIs, scopes)

Breaking Changes

⚠️ BREAKING: API endpoints restructured to /api/v1/authenticated/* namespace
⚠️ BREAKING: RBAC now required - bearer tokens need explicit assignments in policy.yaml

Migration path: Well documented in authorization docs. Existing installations need to:

  1. Update API endpoints in clients
  2. Add bearer tokens to RBAC configuration
  3. Configure OAuth provider if using OAuth mode

🎯 Recommendations Summary

Before Merge

  1. Decision needed: Document single-instance limitation OR implement Redis sessions
  2. Add issue: Track session garbage collection implementation
  3. Add issue: Track JWT signature verification implementation
  4. Documentation: Add deployment guide with multi-instance considerations

Post-Merge (Follow-up PRs)

  1. Implement rate limiting on OAuth endpoints
  2. Add JWT signature verification for offline validation
  3. Implement refresh token rotation
  4. Add structured audit logging
  5. Extend test coverage per gaps above
  6. Add Prometheus metrics for auth/authz events

🏆 Overall Assessment

Status: ✅ APPROVE WITH RECOMMENDATIONS

This is a high-quality, well-architected implementation that follows OAuth 2.0 and OIDC best practices. The code is clean, well-tested, and properly structured. The Casbin RBAC integration is elegant and flexible.

The main concern is the in-memory session storage, which limits production deployments to single instances. This should either be:

  1. Implemented with Redis before merge (recommended for production-ready state)
  2. Clearly documented as a limitation with a follow-up issue to track

The security implementation is solid with appropriate PKCE/CSRF protections. The missing JWT verification and lack of rate limiting are acceptable for initial release if documented as known limitations.

Recommendation: Merge after deciding on session storage approach and documenting deployment constraints.


📚 Documentation Quality

✅ Excellent documentation added:

  • docs/content/oauth-authentication.md - Comprehensive OAuth guide
  • docs/content/authorization.md - RBAC configuration guide
  • examples/native-oauth/ - Working examples
  • docs/prds/authorization-system.md - Design documentation

Minor improvement: Add troubleshooting section for common OAuth issues (redirect URI mismatch, provider errors, etc.)


Great work on this implementation! The architecture is solid and the code quality is excellent. Address the session storage concern and this will be production-ready. 🚀

stmh added 2 commits October 27, 2025 00:34
The alias was pointing to the directory 'src/generated' which caused Vite/Rollup
to try to load the directory itself instead of the index.ts file, resulting in:
"EISDIR: illegal operation on a directory, read"

Now explicitly points to 'src/generated/index.ts' so Vite can resolve the module correctly.
The build script needs to run 'svelte-kit sync' to generate the .svelte-kit/tsconfig.json
and other required configuration files before vite can build the project.

This fixes the Docker build error where tsconfig.json couldn't find its base config:
"Cannot find base config file './.svelte-kit/tsconfig.json'"

The sync command generates:
- .svelte-kit/tsconfig.json (base TypeScript config)
- .svelte-kit/types/ (SvelteKit type definitions)
- Other SvelteKit-generated files needed for the build
@claude
Copy link

claude bot commented Oct 26, 2025

PR #467 Review: OAuth2/OIDC Authentication & RBAC Authorization System

Executive Summary

This is an excellent and comprehensive implementation of OAuth2/OIDC authentication with Casbin-based RBAC authorization. The PR adds ~30,249 lines implementing a production-grade authentication system with:

Strengths:

  • Complete OIDC-compliant OAuth implementation with PKCE
  • Well-architected RBAC system using Casbin
  • Comprehensive documentation and examples
  • Good test coverage for core flows
  • Proper CSRF protection and security measures
  • Clean separation of concerns across frontend/backend

⚠️ Production Readiness Concerns:

  • In-memory session storage (not production-ready)
  • Hardcoded test credentials in default config files
  • Bearer tokens stored in plaintext
  • Frontend tokens in sessionStorage (XSS vulnerable)
  • Performance issues with token validation

🎯 Overall Assessment

Code Quality: 8.5/10 - Well-structured, clean code with good patterns
Security: 7/10 - Good foundations but needs hardening for production
Test Coverage: 7/10 - Core flows covered, missing edge cases
Documentation: 9/10 - Excellent docs with clear examples

Recommendation: Approve with conditions - Address critical security items before production deployment.


🔒 Security Review

Critical Issues (Must Fix Before Production)

1. In-Memory Session Storage 🔴 HIGH PRIORITY

Location: scotty/src/oauth/mod.rs:37-40

pub type DeviceFlowStore = Arc<Mutex<HashMap<String, DeviceFlowSession>>>;
pub type WebFlowStore = Arc<Mutex<HashMap<String, WebFlowSession>>>;
pub type OAuthSessionStore = Arc<Mutex<HashMap<String, OAuthSession>>>;

Issue: Sessions stored in memory will be lost on server restart, causing active users to lose authentication.

Impact:

  • Session loss on deployments
  • No distributed deployment support
  • Memory leaks with expired sessions

Recommendation:

// Replace with Redis or database backend
pub trait SessionStore: Send + Sync {
    async fn get(&self, key: &str) -> Option<OAuthSession>;
    async fn set(&self, key: &str, value: OAuthSession, ttl: Duration);
    async fn delete(&self, key: &str);
}

Add feature flags for different backends:

  • session-redis - Redis-backed sessions
  • session-db - Database-backed sessions
  • session-memory - In-memory (dev only)

2. Test Credentials in Production Config 🔴 HIGH PRIORITY

Location: config/casbin/policy.yaml:42-48

assignments:
  identifier:test-bearer-token-123:
    - role: admin
      scopes:
        - client-a
        - client-b
        - qa
        - default

Issue: Hardcoded test bearer token with admin access in default production config.

Files affected:

  • config/casbin/policy.yaml
  • config/default.yaml
  • scotty/tests/test_bearer_auth.yaml

Recommendation:

  1. Move test tokens to config/dev.yaml or config/test.yaml
  2. Add validation on startup to reject known test tokens in production
  3. Update documentation to warn against using default config in production
// Add startup validation
fn validate_production_config(config: &AuthConfig) -> Result<()> {
    const TEST_TOKENS: &[&str] = &["test-bearer-token-123", "hello-world"];
    
    for (identifier, _) in &config.assignments {
        if identifier.contains("test-") || TEST_TOKENS.contains(&identifier.as_str()) {
            bail!("Production config contains test credentials: {}", identifier);
        }
    }
    Ok(())
}

3. Bearer Tokens in Plaintext 🟡 MEDIUM PRIORITY

Location: scotty/src/api/auth_core.rs:173-182

fn find_token_identifier(shared_app_state: &SharedAppState, token: &str) -> Option<String> {
    for (identifier, configured_token) in &shared_app_state.settings.api.bearer_tokens {
        if configured_token == token {  // ⚠️ Plaintext comparison
            return Some(identifier.clone());
        }
    }
    None
}

Issue: Bearer tokens stored and compared in plaintext with O(n) complexity.

Recommendation:

// Use bcrypt/argon2 for token hashing
use argon2::{Argon2, PasswordHash, PasswordVerifier};

pub struct TokenStore {
    tokens: HashMap<String, String>,  // identifier -> hashed_token
}

impl TokenStore {
    pub fn verify(&self, identifier: &str, token: &str) -> bool {
        if let Some(hash) = self.tokens.get(identifier) {
            let parsed_hash = PasswordHash::new(hash).ok()?;
            Argon2::default().verify_password(token.as_bytes(), &parsed_hash).is_ok()
        } else {
            false
        }
    }
}

4. Frontend Token Storage in sessionStorage 🟡 MEDIUM PRIORITY

Location: frontend/src/lib/sessionManager.ts:82-88

setOAuthToken(token: string): boolean {
    return this.setItem(STORAGE_KEYS.OAUTH_TOKEN, token);  // ⚠️ XSS vulnerable
}

Issue: OAuth tokens stored in sessionStorage are accessible to XSS attacks.

Recommendation:

  1. Use HTTP-only cookies for token storage (immune to XSS)
  2. Implement Content Security Policy headers
  3. Add CSRF tokens for state-changing operations
// Backend: Set HTTP-only cookie instead of returning token
async fn exchange_session_for_token(/* ... */) -> impl IntoResponse {
    let cookie = Cookie::build("scotty_session", session_token)
        .http_only(true)
        .secure(true)
        .same_site(SameSite::Strict)
        .path("/")
        .finish();
    
    (
        StatusCode::OK,
        [(header::SET_COOKIE, cookie.to_string())],
        Json(user_info)  // Don'\''t return token
    )
}

5. OAuth Token Validation on Every Request 🟡 MEDIUM PRIORITY

Location: scotty/src/api/auth_core.rs:151

match oauth_state.client.validate_oidc_token(token).await {  // ⚠️ Network call every request
    Ok(oidc_user) => { /* ... */ }
}

Issue: Every API request makes a network call to OIDC provider for token validation.

Impact:

  • High latency (network roundtrip per request)
  • OIDC provider rate limits
  • Scalability bottleneck

Recommendation:

// Implement token caching with JTI tracking
pub struct TokenCache {
    cache: Arc<RwLock<HashMap<String, (OidcUser, Instant)>>>,
    ttl: Duration,
}

impl TokenCache {
    pub async fn validate_with_cache(&self, token: &str) -> Result<OidcUser> {
        // Check cache first
        if let Some((user, expiry)) = self.cache.read().await.get(token) {
            if Instant::now() < *expiry {
                return Ok(user.clone());
            }
        }
        
        // Validate with provider
        let user = self.validate_oidc_token(token).await?;
        
        // Cache result
        self.cache.write().await.insert(
            token.to_string(), 
            (user.clone(), Instant::now() + self.ttl)
        );
        
        Ok(user)
    }
}

Additional Security Observations

✅ Good Security Practices Found:

  1. PKCE Implementation (oauth/handlers.rs:199-203)

    • Proper SHA256 PKCE for authorization code flow
    • Challenge/verifier correctly paired
  2. CSRF Protection (oauth/handlers.rs:179-182)

    • State parameter with session ID
    • CSRF token validation on callback
  3. Session Cleanup (oauth/handlers.rs:425-429)

    • Web flow sessions removed after use
    • OAuth sessions are one-time use
  4. CLI Token File Permissions (scottyctl/src/auth/storage.rs:35-42)

    • Token files created with 0o600 (owner-only)
    • Proper Unix file security
  5. Input Validation

    • Authorization header format validation
    • State parameter format checking
    • Session expiration checks

🏗️ Architecture & Code Quality

Strengths

  1. Clean Separation of Concerns

    • Authentication in scotty/src/oauth/ and scotty/src/api/auth_core.rs
    • Authorization in scotty/src/services/authorization/
    • Frontend auth in frontend/src/lib/authService.ts
  2. Proper Error Handling

    • Type-safe OAuth error enums (scotty-core/src/auth.rs)
    • Unified error responses
    • Good error logging
  3. Comprehensive Type Safety

    • CurrentUser, OAuthSession, DeviceFlowSession properly typed
    • Permission enum prevents typos
    • AuthMode enum for mode switching
  4. Casbin Integration

    • Well-structured policy model (config/casbin/model.conf)
    • Proper role/permission separation
    • Wildcard scope expansion (service.rs:343-355)

Areas for Improvement

1. Token Lookup Performance 🟡

Location: auth_core.rs:173-182

// O(n) complexity - should be O(1)
for (identifier, configured_token) in &shared_app_state.settings.api.bearer_tokens {
    if configured_token == token {
        return Some(identifier.clone());
    }
}

Recommendation:

// Pre-build reverse lookup map on startup
pub struct BearerTokenStore {
    token_to_id: HashMap<String, String>,  // token -> identifier (O(1) lookup)
}

2. Authorization Middleware Ordering ✅ GOOD

Location: api/middleware/authorization.rs:30-63

The middleware properly checks:

  1. Authorization is configured
  2. User is authenticated (from request extensions)
  3. App context extraction
  4. Permission checking

Good defensive programming with early returns.


3. Global Permission Check Logic 🟡

Location: services/authorization/service.rs:196-249

pub async fn check_global_permission(&self, user: &str, action: &Permission) -> bool {
    // Checks admin permissions without app context
    // Could allow unintended global access
}

Concern: Admin users bypass app-level checks. Verify this is intentional behavior.

Recommendation: Add audit logging for global permission grants:

if has_permission {
    audit_log::log_global_permission_grant(user, action);
    return true;
}

🧪 Test Coverage

Existing Tests ✅

  1. Bearer Auth Tests (api/bearer_auth_tests.rs)

    • Valid/invalid token scenarios
    • Missing token handling
  2. OAuth Flow Tests (api/oauth_flow_tests.rs)

    • Device flow with mock OIDC provider
    • Uses wiremock for provider simulation
  3. Authorization Tests (services/authorization/tests.rs)

    • Basic permission checking
    • Multi-scope app testing

Missing Test Coverage 🟡

  1. Security Edge Cases

    • CSRF token mismatch scenarios
    • Session timeout edge cases
    • Token replay attacks
    • XSS attack vectors
  2. PKCE Flow

    • Invalid PKCE verifier
    • PKCE challenge tampering
  3. RBAC Edge Cases

    • Wildcard scope expansion
    • Permission merging from multiple roles
    • Conflicting role assignments
  4. Frontend Tests

    • No frontend authentication tests found
    • Missing integration tests for OAuth callback

Recommendation:

// Add security-focused tests
#[tokio::test]
async fn test_csrf_token_mismatch() {
    // Verify CSRF protection works
}

#[tokio::test]
async fn test_expired_session_rejection() {
    // Test session expiration
}

#[tokio::test]
async fn test_token_replay_attack() {
    // Ensure one-time use sessions
}

📚 Documentation Quality

Excellent Documentation ✅

  1. OAuth Setup Guide (docs/content/oauth-authentication.md)

    • Clear flow diagrams
    • Multiple provider examples (GitLab, Auth0, Keycloak, Google)
    • Step-by-step setup instructions
  2. Authorization Guide (docs/content/authorization.md)

    • RBAC concepts well-explained
    • Good configuration examples
    • Clear permission descriptions
  3. Examples (examples/native-oauth/)

    • Complete working examples
    • Docker Compose setup
    • Environment variable templates

Documentation Gaps 🟡

  1. Security Architecture

    • Missing threat model documentation
    • No security best practices section
    • Missing production deployment security checklist
  2. Migration Guide

    • No guide for migrating from bearer to OAuth
    • Missing rollback procedures

Recommendation: Add docs/content/security.md:

# Security Architecture

## Threat Model
- XSS attacks via token storage
- CSRF attacks on state-changing operations
- Token replay attacks
- Session hijacking

## Mitigation Strategies
- HTTP-only cookies (planned)
- CSRF tokens on all mutations
- One-time session exchange
- Short session expiration

## Production Checklist
- [ ] Remove test tokens from config
- [ ] Enable Redis session storage
- [ ] Configure CSP headers
- [ ] Set up rate limiting
- [ ] Enable audit logging

🎨 Code Style & Best Practices

Excellent Patterns ✅

  1. Shared Types in scotty-core (scotty-core/src/auth.rs)

    • DRY principle applied
    • Workspace-wide consistency
    • Version compatibility checking
  2. Builder Pattern (scotty-core/src/http_client.rs)

    • Clean HTTP client creation
    • Configurable retry logic
    • Good defaults
  3. Error Conversion (scotty-core/src/auth.rs)

    • Proper From<OAuthError> implementations
    • HTTP status code mapping
    • RFC 6749 compliance

Minor Style Issues 🟢

  1. Unused #[allow(dead_code)] Attributes

    • CurrentUser.picture and access_token fields marked dead_code
    • Comments explain future use, which is good
  2. TODO Comments (oauth/handlers.rs:556)

    // TODO: Generate a Scotty JWT token instead

    Consider creating GitHub issues for TODOs


🚀 Performance Considerations

Current Performance Issues 🟡

  1. O(n) Token Lookup (mentioned above)
  2. OIDC Validation Network Calls (mentioned above)
  3. Mutex Contention on session stores under load

Recommendation: Benchmark and optimize:

// Use RwLock instead of Mutex for read-heavy workloads
pub type DeviceFlowStore = Arc<RwLock<HashMap<String, DeviceFlowSession>>>;

// Or use concurrent hashmap
use dashmap::DashMap;
pub type DeviceFlowStore = Arc<DashMap<String, DeviceFlowSession>>;

📋 Recommendations Summary

Before Production Deployment (Critical)

  • Replace in-memory session storage with Redis/database
  • Remove test tokens from config/casbin/policy.yaml
  • Implement bearer token hashing (bcrypt/argon2)
  • Add startup validation to reject test credentials in production
  • Document production deployment security requirements

Before Next Release (High Priority)

  • Implement token caching for OAuth validation
  • Replace sessionStorage with HTTP-only cookies
  • Add Content Security Policy headers
  • Optimize token lookup with reverse map (O(1))
  • Add audit logging for authorization decisions

Future Improvements (Medium Priority)

  • Add refresh token flow
  • Implement token revocation endpoint
  • Add security-focused test suite
  • Create frontend authentication tests
  • Add rate limiting for auth endpoints
  • Document threat model and mitigation strategies

Nice to Have (Low Priority)

  • Support for multiple OIDC providers simultaneously
  • Session management UI for admins
  • Audit log viewer in UI
  • Token usage analytics

🎉 Conclusion

This is a high-quality, well-architected implementation of OAuth2/OIDC authentication and RBAC authorization. The code demonstrates:

  • Strong understanding of OAuth2/OIDC standards
  • Proper security practices (PKCE, CSRF protection)
  • Clean architecture and separation of concerns
  • Excellent documentation

The main concerns are production readiness rather than fundamental design issues:

  • In-memory storage needs replacement
  • Test credentials must be removed from production configs
  • Token storage needs hardening

Recommendation: Merge with production deployment blockers documented.

The code is production-quality with the exception of the session storage backend. I recommend:

  1. Merge this PR to get the feature into the codebase
  2. Create follow-up issues for production readiness items
  3. Add warnings in documentation about production deployment requirements
  4. Consider feature flags to gate OAuth/RBAC behind configuration

Great work on this comprehensive feature! 🚀


📎 File References

Key files for review follow-up:

Authentication Core:

  • scotty/src/oauth/handlers.rs - OAuth endpoint handlers
  • scotty/src/api/auth_core.rs - Auth dispatcher
  • scotty/src/oauth/mod.rs - Session storage (needs Redis)

Authorization Core:

  • scotty/src/services/authorization/service.rs - RBAC implementation
  • config/casbin/policy.yaml - Remove test tokens
  • config/casbin/model.conf - Casbin model

Frontend:

  • frontend/src/lib/sessionManager.ts - Switch to HTTP-only cookies
  • frontend/src/lib/authService.ts - Auth orchestration

Configuration:

  • config/default.yaml - Remove test credentials
  • docs/content/oauth-authentication.md - OAuth setup guide

@claude
Copy link

claude bot commented Oct 26, 2025

Pull Request Review: OAuth Authentication & RBAC Authorization System

🎯 Overview

This is a major feature addition implementing OAuth2/OIDC authentication and Role-Based Access Control (RBAC) authorization using Casbin. The PR is substantial (30,250 additions, 100 files) and introduces production-ready security infrastructure.

Overall Assessment: ✅ Approve with Recommendations

The implementation is architecturally sound with good security practices, comprehensive testing, and solid documentation. However, there are several important issues that should be addressed before production deployment.


✅ Strengths

1. Comprehensive Security Implementation

  • ✅ PKCE (Proof Key for Code Exchange) for OAuth flows
  • ✅ CSRF protection with state parameter validation
  • ✅ Proper secret handling using secrecy crate with zeroization
  • ✅ OAuth2 RFC 6749 compliant error codes
  • ✅ Type-safe permission system with Rust enums

2. Well-Structured Architecture

  • ✅ Clear separation between OAuth and RBAC modules
  • ✅ Multiple authentication modes (dev/bearer/oauth)
  • ✅ Flexible RBAC with scopes, roles, and permissions
  • ✅ Casbin integration for policy enforcement
  • ✅ Middleware-based authorization checks

3. Excellent Test Coverage

  • 742 lines of OAuth flow tests (device flow, web flow, token exchange)
  • 406 lines of bearer auth tests (token validation, RBAC)
  • 821 lines of authorization tests (permissions, scopes, roles)
  • ✅ Mock OIDC provider using WireMock
  • ✅ End-to-end authentication flows validated

4. Strong Documentation

  • ✅ Comprehensive OAuth guide (oauth-authentication.md)
  • ✅ Detailed RBAC documentation (authorization.md)
  • ✅ Working examples in examples/native-oauth/
  • ✅ Updated CLI documentation
  • ✅ Configuration examples provided

🔴 Critical Issues (Must Fix)

1. Hardcoded Localhost in OAuth Callback

Location: scotty/src/oauth/handlers.rs:456-459

format!(
    "http://localhost:21342/oauth/callback?session_id={}",
    oauth_session_id
)

Issue: Hardcoded localhost:21342 breaks in production environments and forces HTTP instead of HTTPS.

Impact: OAuth flow will fail in deployed environments, security vulnerability (HTTP).

Recommendation:

// Use base_url from configuration or extract from request
let base_url = state.settings.api.base_url
    .as_ref()
    .unwrap_or(&"http://localhost:21342".to_string());
format!("{}/oauth/callback?session_id={}", base_url, oauth_session_id)

2. No Session Cleanup for Expired OAuth Sessions

Location: scotty/src/oauth/handlers.rs (session storage)

Issue: OAuth sessions stored in Arc<Mutex<HashMap>> are never cleaned up. Expired sessions remain in memory indefinitely.

Impact: Memory leak vulnerability, especially under attack scenarios with repeated device flow polling.

Recommendation: Implement background task to clean expired sessions:

// Spawn cleanup task on startup
tokio::spawn(async move {
    let mut interval = tokio::time::interval(Duration::from_secs(300));
    loop {
        interval.tick().await;
        cleanup_expired_sessions(&oauth_session_store).await;
    }
});

3. No Rate Limiting on OAuth Endpoints

Location: OAuth endpoint handlers (/oauth/device, /oauth/device/token, /api/oauth/callback)

Issue: No rate limiting allows:

  • Unlimited device flow polling (DoS)
  • Brute force attempts on callback endpoint
  • Resource exhaustion attacks

Impact: Service availability and security vulnerability.

Recommendation: Add rate limiting middleware using tower-governor or similar:

use tower_governor::{governor::GovernorConfigBuilder, GovernorLayer};

let governor_conf = Box::new(
    GovernorConfigBuilder::default()
        .per_second(5)
        .burst_size(10)
        .finish()
        .unwrap()
);

// Apply to OAuth routes
oauth_router.layer(GovernorLayer { config: governor_conf })

⚠️ High Priority Issues (Should Fix)

4. Fallback Service Grants Excessive Permissions

Location: scotty/src/services/authorization/fallback.rs:67

Assignment {
    role: "admin".to_string(),  // ❌ Too permissive
    scopes: vec!["default".to_string()],
}

Issue: Legacy bearer tokens get admin role. This violates principle of least privilege.

Impact: Security risk if fallback service is used in production.

Recommendation:

Assignment {
    role: "viewer".to_string(),  // ✅ Minimal permissions
    scopes: vec!["default".to_string()],
}

5. Bearer Token Comparison Not Timing-Safe

Location: scotty/src/api/auth_core.rs:93-99

Issue: Token lookup uses standard == comparison in find_token_identifier(), vulnerable to timing attacks.

Impact: Theoretical timing attack vector for token enumeration.

Recommendation:

use subtle::ConstantTimeEq;

// Compare tokens in constant time
if provided_token.as_bytes().ct_eq(stored_token.as_bytes()).into() {
    return Some(identifier.clone());
}

6. PKCE Verifier Not Zeroized

Location: scotty/src/oauth/handlers.rs (WebFlowSession)

Issue: PKCE verifier stored as base64 string without secrecy wrapper.

Impact: Verifier remains in memory unprotected, potential information leakage.

Recommendation:

use secrecy::{Secret, ExposeSecret};

pub struct WebFlowSession {
    pub csrf_token: String,
    pub pkce_verifier: Secret<String>,  // ✅ Protected
    pub frontend_callback_url: Option<String>,
    pub expires_at: DateTime<Utc>,
}

🟡 Medium Priority Issues (Consider Fixing)

7. Limited Input Validation

Locations: Various endpoints accepting identifiers, role names, scopes

Issue: No max length constraints, no character restrictions (allows special chars).

Impact: Potential for confusion attacks, data storage issues.

Recommendation: Add validation:

fn validate_identifier(id: &str) -> Result<(), ValidationError> {
    if id.len() > 128 { return Err(...); }
    if !id.chars().all(|c| c.is_alphanumeric() || c == '_' || c == '-' || c == ':') {
        return Err(...);
    }
    Ok(())
}

8. No Token Refresh Mechanism

Issue: OAuth access tokens expire but no refresh token flow implemented.

Impact: Users must re-authenticate frequently.

Recommendation: Implement refresh token storage and automatic renewal.

9. OIDC Issuer URL SSRF Risk

Location: OAuth client construction using user-provided oidc_issuer_url

Issue: No allowlist validation on OIDC provider URLs.

Impact: Potential SSRF if configuration is compromised.

Recommendation: Add URL allowlist or restrict to known domains.


📝 Code Quality Observations

Positive Patterns:

  • ✅ Excellent use of type safety (enums for permissions, auth modes)
  • ✅ Comprehensive logging with tracing crate
  • ✅ Proper error handling with custom error types
  • ✅ Good separation of concerns (modules for OAuth, RBAC, middleware)
  • ✅ Constants for magic strings where appropriate

Minor Improvements:

  • ℹ️ Some hardcoded timeouts (600s, 300s, 5s) should be named constants
  • ℹ️ Duplicate model.conf in tests vs config (consolidate)
  • ℹ️ Some #[allow(dead_code)] attributes could be removed if fields are used

🧪 Test Coverage Summary

Area Lines Coverage Status
OAuth Flows 742 Good ✅ Device, web, callback, exchange
Bearer Auth 406 Good ✅ Valid/invalid tokens, RBAC
Authorization 821 Good ✅ Permissions, scopes, roles
Session Cleanup 0 Missing ❌ No tests for expiration
Rate Limiting 0 Missing ❌ No load tests
OIDC Failures 0 Partial ⚠️ Limited provider error scenarios

Recommendation: Add integration tests for:

  • Session cleanup behavior
  • Concurrent request handling
  • OIDC provider timeout/failure scenarios

🔐 Security Checklist

Security Aspect Status Notes
PKCE for OAuth Implemented correctly
CSRF Protection State parameter validation
Secret Zeroization ⚠️ Good for bearer tokens, improve for PKCE
Input Validation ⚠️ Basic validation, needs length limits
Rate Limiting Not implemented
Session Cleanup Not implemented
HTTPS Enforcement ⚠️ Not enforced in config
Timing-Safe Comparison Standard == used
Token Refresh Not implemented

📚 Documentation Quality

Excellent:

  • ✅ Clear OAuth authentication guide with provider examples
  • ✅ Comprehensive RBAC documentation with use cases
  • ✅ Updated CLI documentation
  • ✅ Working configuration examples

Could Improve:

  • Add production deployment checklist (HTTPS, rate limits, monitoring)
  • Document session cleanup expectations
  • Add security best practices section
  • Include OAuth token refresh recommendations

🎯 Recommendations Summary

Before Merge (Critical):

  1. ✅ Fix hardcoded localhost in OAuth callback
  2. ✅ Implement session cleanup mechanism
  3. ✅ Add rate limiting to OAuth endpoints

Post-Merge (High Priority):

  1. Change fallback service to use viewer role
  2. Implement timing-safe token comparison
  3. Add secrecy wrapper to PKCE verifier

Future Enhancements:

  1. Add input validation with length limits
  2. Implement OAuth token refresh flow
  3. Add comprehensive load testing
  4. Document production security requirements

💬 Final Thoughts

This is an excellent implementation of a complex security feature. The code demonstrates:

  • Strong understanding of OAuth2/OIDC standards
  • Good security practices (PKCE, CSRF, secret handling)
  • Comprehensive testing approach
  • Clear documentation

The critical issues are relatively minor fixes that don't undermine the core architecture. Once addressed, this will be a production-ready authentication and authorization system.

Great work on this substantial feature! 🚀


📋 Action Items

  • Fix hardcoded localhost in OAuth callback fallback
  • Implement background task for session cleanup
  • Add rate limiting middleware to OAuth endpoints
  • Review fallback service permissions (consider viewer role)
  • Add timing-safe comparison for bearer tokens
  • Add input validation with max length constraints
  • Document production deployment security requirements

Review conducted with automated code analysis and security best practices evaluation.

stmh added 8 commits October 27, 2025 00:46
Running 'cargo run' from the ts-generator directory with working-directory
was causing path resolution issues in CI. Now runs from workspace root
using explicit package specification.

This ensures CARGO_MANIFEST_DIR resolves correctly and files are generated
in the expected location for subsequent build steps.
Previously, index.ts was a manually maintained file that was gitignored,
which caused CI builds to fail since the file didn't exist in the
repository. Now ts-generator programmatically creates index.ts with all
type guard functions and re-exports, ensuring it's always available
during builds.

This resolves the Docker build failure where the frontend couldn't load
the generated types due to missing index.ts.
Removed the generated index.ts barrel file in favor of inline type
guards directly in store files where needed. This eliminates the need
to generate and maintain a large TypeScript file in Rust.

Changes:
- Remove index.ts generation from ts-generator
- Add inline type guards in containerLogsStore and taskOutputStore
- Remove $generated path alias from svelte.config.js
- Import WebSocketMessage directly from generated files

Benefits:
- No hardcoded TypeScript in Rust strings
- Type guards are co-located with their usage
- Simpler build process
- Easier to maintain
Replaces standard string equality with constant-time comparison using
the subtle crate to prevent timing attacks that could reveal valid
bearer tokens through response time measurements.

The vulnerable code used simple == comparison which short-circuits on
the first character mismatch, allowing attackers to determine correct
characters through timing analysis. This fix ensures all token
comparisons take the same amount of time regardless of match position.

Fixes: #520
Relates to: scotty-39
The login handler was still using standard == comparison for bearer
tokens, vulnerable to the same timing attack fixed in auth_core.rs.

This completes the timing attack mitigation across all token
comparison points in the authentication system.

Addresses PR #524 review feedback
…se URL

The OAuth callback handler used a hardcoded localhost URL for redirecting
users back to the frontend after authentication. This prevented OAuth from
working in production environments accessed via different hostnames.

Changes:
- Added 'frontend_base_url' field to OAuthSettings configuration
- Defaults to 'http://localhost:21342' for backward compatibility
- Updated OAuth callback handler to use configured URL instead of hardcoded value
- Maintains support for per-flow frontend_callback_url override

The frontend_base_url can now be configured in the settings file:
  [api.oauth]
  frontend_base_url = "https://scotty.example.com"

Fixes: #517
Relates to: scotty-36
…-token-comparison

fix(security): use constant-time comparison for bearer token validation
…se URL

The OAuth callback handler used a hardcoded localhost URL for redirecting
users back to the frontend after authentication. This prevented OAuth from
working in production environments accessed via different hostnames.

Changes:
- Added 'frontend_base_url' field to OAuthSettings configuration
- Defaults to 'http://localhost:21342' for backward compatibility
- Updated OAuth callback handler to use configured URL instead of hardcoded value
- Maintains support for per-flow frontend_callback_url override

The frontend_base_url can now be configured in the settings file:
  [api.oauth]
  frontend_base_url = "https://scotty.example.com"

Fixes: #517
Relates to: scotty-36
@claude
Copy link

claude bot commented Oct 27, 2025

Code Review - PR #467 (v0.2.0)

Thank you for this comprehensive PR introducing authentication and authorization to Scotty! This is a significant enhancement that adds OAuth 2.0, bearer token auth, and RBAC via Casbin. The implementation shows solid engineering practices, but there are critical security issues that must be addressed before merging.

🔴 Critical Issues (Must Fix)

1. Plaintext Secrets in Version Control

Files: config/default.yaml, config/local.yaml

Bearer tokens and OAuth client secrets are stored as plaintext in configuration files:

bearer_tokens:
    admin: "gFW5k1fdvYw8iB2xCxw5qZXj5pkP9dga"
    client-a: "j3Xq973L67JVAsQpU4PfZAMKdMsfdXsu"

Risk: Anyone with repository access can see these secrets.

Recommendation:

  • Remove all plaintext secrets from config files
  • Require environment variables for production (e.g., SCOTTY__API__BEARER_TOKENS__ADMIN)
  • Add startup validation that fails if secrets aren't provided via env vars
  • Consider storing only hashed tokens in configs

2. No Token Expiration Mechanism

Files: Throughout auth implementation

Bearer tokens have no expiration - once issued, they're valid indefinitely.

Risk: Stolen tokens can be used perpetually.

Recommendation:

  • Implement JWT tokens with expiration claims
  • Add token refresh mechanism
  • Implement token revocation API

3. Insecure Authorization Fallback

File: scotty/src/app_state.rs:97-109

If Casbin config fails to load, the system falls back to a permissive default:

Err(e) => {
    warn!("Failed to load authorization config...");
    FallbackService::create_fallback_service(settings.api.access_token.clone()).await
}

Risk: Configuration errors could grant unintended access.

Recommendation: Fail closed - refuse to start if authorization can't be loaded.

4. Container Running as Root

File: Dockerfile:78-79

User switching is commented out:

# RUN chown -R $APP_USER:$APP_USER ${APP}
# USER $APP_USER

Risk: Container breakout vulnerabilities.

Recommendation: Uncomment and fix permission issues properly, never run as root.

🟡 High Priority Issues (Should Fix)

5. Missing Input Validation on Admin Endpoints

Files: scotty/src/api/rest/handlers/admin/*.rs

Limited validation on user IDs, role names, scope names could allow injection attacks or invalid data.

Recommendation: Add comprehensive input validation:

fn validate_identifier(s: &str) -> Result<(), &'static str> {
    if s.is_empty() || s.len() > 128 {
        return Err("Invalid length");
    }
    if !s.chars().all(|c| c.is_alphanumeric() || c == '-' || c == '_') {
        return Err("Invalid characters");
    }
    Ok(())
}

6. In-Memory Session Storage Without Cleanup

File: scotty/src/oauth/handlers.rs

OAuth sessions stored in HashMap with no cleanup mechanism.

Risk: Memory leak over time.

Recommendation:

  • Implement background cleanup task for expired sessions
  • Consider Redis for production deployments
  • Use tokio::sync::RwLock instead of std::sync::Mutex

7. Limited Security Test Coverage

Files: Test files throughout

Found basic tests but missing critical security scenarios:

  • No tests for token timing attacks
  • No tests for session expiration
  • No tests for CSRF bypass attempts
  • No tests for authorization bypass attempts

Recommendation: Add comprehensive security test suite.

8. Missing Security Scanning in CI

File: .github/workflows/ci.yml

CI runs tests, fmt, clippy but no security scans.

Recommendation: Add:

- name: Security audit
  run: cargo audit
- name: Dependency checks
  run: cargo deny check

🟢 Positive Observations

Excellent work on several security aspects:

  1. Constant-time comparison: Properly uses subtle::ConstantTimeEq for token comparison (auth_core.rs:181)
  2. Secret masking: Excellent MaskedSecret implementation with partial display
  3. Type safety: Good use of Rust's type system for permissions and errors
  4. OAuth 2.0 compliance: Proper implementation of device flow and web flow
  5. RBAC structure: Well-designed authorization with Casbin

📊 Additional Recommendations

Medium Priority:

  • Implement permission caching for performance (check_permission called on every request)
  • Extract duplicate token validation logic into shared function
  • Add rate limiting on OAuth endpoints
  • Standardize error handling (mix of anyhow::Result and Result<T, AppError>)

Low Priority:

  • Add audit logging for all admin actions
  • Implement security headers middleware
  • Document the Casbin permission model thoroughly
  • Add permission visualization tooling

🎯 Overall Assessment

Risk Level: HIGH

This PR implements valuable functionality with generally good code quality, but critical security issues prevent merging in current state. The secrets management and token expiration issues are particularly concerning for production use.

Recommendation: Address all critical issues before merge. The authentication system should be secure by default, with production-ready secret management.

📝 Testing Recommendations

Before merging, ensure:

  • All secrets removed from config files
  • Token expiration implemented and tested
  • Authorization fails closed when config missing
  • Container runs as non-root user
  • Input validation added with tests
  • Security test suite expanded
  • CI includes cargo audit
  • Documentation updated with security best practices

Overall: Great foundation! The architecture is solid and the OAuth implementation is well-structured. Once the security issues are addressed, this will be a strong addition to Scotty.

Let me know if you'd like me to elaborate on any of these points or provide code examples for the fixes.

stmh added 4 commits October 27, 2025 13:52
Replaces standard string equality with constant-time comparison using
the subtle crate to prevent timing attacks that could reveal valid
bearer tokens through response time measurements.

The vulnerable code used simple == comparison which short-circuits on
the first character mismatch, allowing attackers to determine correct
characters through timing analysis. This fix ensures all token
comparisons take the same amount of time regardless of match position.

Fixes: #520
Relates to: scotty-39
The login handler was still using standard == comparison for bearer
tokens, vulnerable to the same timing attack fixed in auth_core.rs.

This completes the timing attack mitigation across all token
comparison points in the authentication system.

Addresses PR #524 review feedback
Add documentation for the new frontend_base_url configuration option
that allows OAuth to work with production domains instead of being
hardcoded to localhost.

Changes:
- Add frontend_base_url to OAuth configuration examples
- Add SCOTTY__API__OAUTH__FRONTEND_BASE_URL environment variable
- Add "Understanding OAuth URLs" section explaining the difference
  between redirect_url (backend callback) and frontend_base_url
  (frontend application base)
- Include production configuration example
- Update environment variables reference table

This documents the fix for scotty-36 where OAuth callback URLs were
hardcoded to localhost, preventing OAuth from working in production
environments.
@claude
Copy link

claude bot commented Oct 27, 2025

Code Review: Version 0.2.0

This is a major release introducing OAuth authentication, RBAC authorization, unified output system, and observability. This is a comprehensive review with specific findings.


🎯 Executive Summary

Overall Assessment:APPROVE with Important Recommendations

This is a well-architected and substantial release that significantly enhances Scotty's security and operational capabilities. The implementation demonstrates solid engineering practices, good separation of concerns, and comprehensive testing (777 test occurrences across 57 files). However, there are critical security concerns that should be addressed before production deployment.

Changed: 245 files, +30,318 additions, -2,253 deletions


🔒 Critical Security Issues

1. 🔴 CRITICAL: Hardcoded Bearer Tokens in Version Control

Location: config/default.yaml:7-11

bearer_tokens:
    admin: "gFW5k1fdvYw8iB2xCxw5qZXj5pkP9dga"
    client-a: "j3Xq973L67JVAsQpU4PfZAMKdMsfdXsu"
    client-b: "mPFMijuZeEAaus94hWApEHzD8JMhrcRk"
    hello-world: "mv7UuZddtVpAM5c6tWCFejCp1vizuSlX"
    test-bearer-token-123: "9oWZsLsUzFvC01nPqB6nzZpDK6Zlv3KR"

Issue: While these appear to be example tokens, committing ANY tokens to version control is a critical security vulnerability. These could be accidentally deployed to production or leaked via forks.

Recommendation:

  • Replace all token values with placeholder strings: REPLACE_VIA_ENV_VAR
  • Add validation to reject placeholder tokens on startup
  • Document clearly in comments that real tokens MUST be set via environment variables
  • Consider scanning repository history and rotating any tokens that were previously committed

Impact: HIGH - If these are real tokens or get reused, they provide unauthorized access


2. 🟡 MEDIUM: Wildcard User Authorization

Location: config/casbin/policy.yaml:70-73

'*':
  - role: viewer
    scopes:
    - default

Issue: The wildcard '*' assignment grants viewer access to the default scope for ANY user identifier. This could grant unintended access if the authentication layer has a bypass.

Recommendation:

  • Document this behavior clearly in authorization documentation
  • Ensure this is intentional for your use case (e.g., public read-only access)
  • Consider removing this in production environments or restricting to specific authenticated users
  • Add explicit warnings in the policy file comments

Impact: MEDIUM - Potential unauthorized read access if auth is misconfigured


3. 🟡 MEDIUM: Token Storage in sessionStorage

Location: frontend/src/lib/sessionManager.ts

Positive: Unlike previous reviews suggested, this implementation uses sessionStorage (not localStorage), which is significantly more secure:

  • ✅ Tokens are cleared when the browser tab closes
  • ✅ Not shared across browser tabs
  • ✅ Limited lifetime exposure

Remaining Concerns:

  • Still vulnerable to XSS attacks (can't avoid with client-side storage)
  • No explicit token expiration handling in the storage layer

Recommendations:

  • Add Content Security Policy (CSP) headers to mitigate XSS
  • Implement token expiration validation when retrieving from storage
  • Consider adding integrity checks (e.g., HMAC) if feasible
  • Document the XSS risk in security documentation

Impact: MEDIUM - Acceptable for many use cases, but requires XSS prevention


🐛 Code Quality Issues

4. 🟡 MEDIUM: Panic-Prone Code with Multiple unwrap() Calls

Found: 20 files with .unwrap() calls (searched specifically for production code)

Example: scotty/src/settings/config.rs:179

let settings: Settings = builder.build().unwrap().try_deserialize().unwrap();

Issue: Using unwrap() in configuration loading can cause application panics instead of graceful error handling.

Recommendation:

let settings: Settings = builder.build()
    .context("Failed to build configuration")?
    .try_deserialize()
    .context("Failed to deserialize settings")?;

Files to Review:

  • scottyctl/src/commands/apps/mod.rs
  • scottyctl/src/utils/*.rs
  • scotty/src/oauth/device_flow.rs
  • scotty/src/oauth/handlers.rs

Impact: MEDIUM - Can cause crashes instead of error messages


5. 📝 LOW: Outstanding TODO Items

Found 4 TODO items in production code:

  1. scottyctl/src/api.rs:133 - Token expiration check and refresh logic
  2. scottyctl/src/commands/auth.rs:173 - Token refresh implementation
  3. scotty/src/api/rest/handlers/admin/assignments.rs:211 - Remove assignment method ⚠️
  4. scotty/src/oauth/handlers.rs:556 - JWT token generation instead of OAuth token

Recommendation:

Impact: LOW-MEDIUM - Some features incomplete but non-blocking


✅ Excellent Implementation Aspects

Architecture & Design

  • Clean separation: Authentication vs authorization properly layered
  • Flexible auth modes: Dev, OAuth, and Bearer with proper mode detection
  • RBAC with Casbin: Industry-standard authorization engine
  • Type safety: Excellent use of Rust's type system and TypeScript generation

Security Strengths

  • OAuth with PKCE: Proper authorization code flow implementation
  • Proper secret handling: Uses secrecy crate with zeroize
  • Error handling: Comprehensive error types without leaking sensitive data
  • Permission middleware: Deny-by-default authorization
  • OIDC validation: Native token validation without proxy dependencies
  • sessionStorage usage: Better than localStorage for tokens

Code Organization

  • Modular crates: scotty-core, scotty-types, ts-generator
  • Comprehensive testing: 777 test occurrences across 57 files
  • Integration tests: OAuth flows, bearer auth, WebSocket messaging
  • Documentation: PRDs, technical docs, and inline documentation

Features

  • Unified output system: Proper time synchronization for logs/output
  • Observability stack: OpenTelemetry, Grafana, 40+ metrics
  • WebSocket streaming: Real-time logs and task output
  • Multi-tenant scopes: Proper app isolation

📊 Test Coverage

Strong test coverage found:

  • 57 files with test code
  • 777 test occurrences throughout codebase
  • Dedicated test files:
    • scotty/src/api/bearer_auth_tests.rs
    • scotty/src/api/oauth_flow_tests.rs
    • scotty/src/api/rest/handlers/login_test.rs
    • scotty/src/services/authorization/tests.rs
    • Multiple integration tests for Docker services

Recommendations:

  • ✅ Backend testing is excellent
  • Consider adding frontend component tests for Svelte components
  • Add E2E tests for complete auth flows with real browsers

🚀 Performance Considerations

Good Practices Found

  • TimedBuffer batching: Reduces lock contention 20-100x (per docs)
  • Async/await: Proper use of Tokio for concurrent operations
  • WebSocket streaming: Efficient real-time data delivery

Optimization Opportunities

  • Permission caching: Consider caching user permissions per session rather than fetching on every request
  • RBAC policy loading: Cache Casbin policies in memory after initial load
  • Connection pooling: Ensure database/storage connections are pooled

Impact: LOW - Current implementation should perform well, optimizations are nice-to-haves


📋 Action Items Summary

Before Production Deployment (CRITICAL)

  1. 🔴 Replace hardcoded bearer tokens in config/default.yaml with placeholders
  2. 🔴 Review and document wildcard authorization policy
  3. 🟡 Add CSP headers to mitigate XSS risks
  4. 🟡 Replace unwrap() calls in configuration and error-prone code

Follow-Up Work (Important but Non-Blocking)

  1. 📝 Create GitHub issues for all TODO items
  2. 📝 Implement assignment removal functionality
  3. 📝 Add token refresh logic for CLI
  4. 📝 Consider permission caching for performance

Nice-to-Haves

  1. Add frontend component tests
  2. Implement E2E authentication tests
  3. Add permission caching optimization
  4. Document security threat model

🎓 Documentation Quality

Excellent documentation found:

  • docs/content/oauth-authentication.md - Comprehensive OAuth guide
  • docs/content/authorization.md - RBAC documentation
  • docs/content/observability.md - Observability setup
  • docs/prds/authorization-system.md - Design rationale
  • docs/prds/unified-output-system.md - Technical decisions
  • README.md - Updated with new features and security notes

Recommendation: Add a SECURITY.md file documenting:

  • Token management best practices
  • Environment variable requirements
  • Security considerations for different deployment scenarios

🏁 Final Recommendation

Verdict:APPROVE with conditions

This is an excellent release that represents a major milestone for Scotty. The architecture is sound, the code quality is high, and the test coverage is comprehensive.

Before merging to production:

  1. Address the critical security issue (hardcoded tokens)
  2. Review and document the wildcard authorization policy
  3. Add CSP headers for XSS protection

The identified issues are typical for a major authentication overhaul and are straightforward to fix. The benefits of this release far outweigh the issues.

Estimated remediation time: 2-4 hours for critical items

Confidence: High - This is production-ready after addressing the critical security items.


Great work on this release! The authentication and authorization system is well-designed and the observability features are impressive. 🎉

stmh added 3 commits October 27, 2025 14:30
The login handler was still using standard == comparison for bearer
tokens, vulnerable to the same timing attack fixed in auth_core.rs.

This completes the timing attack mitigation across all token
comparison points in the authentication system.

Addresses PR #524 review feedback
…se URL

The OAuth callback handler used a hardcoded localhost URL for redirecting
users back to the frontend after authentication. This prevented OAuth from
working in production environments accessed via different hostnames.

Changes:
- Added 'frontend_base_url' field to OAuthSettings configuration
- Defaults to 'http://localhost:21342' for backward compatibility
- Updated OAuth callback handler to use configured URL instead of hardcoded value
- Maintains support for per-flow frontend_callback_url override

The frontend_base_url can now be configured in the settings file:
  [api.oauth]
  frontend_base_url = "https://scotty.example.com"

Fixes: #517
Relates to: scotty-36
…k-url

fix(oauth): replace hardcoded localhost with configurable frontend base URL
@claude
Copy link

claude bot commented Oct 27, 2025

Code Review: PR #467 - Version 0.2.0

Overview

This is a substantial release introducing major new features across authentication, authorization, observability, and frontend improvements. The PR includes:

  • 246 files changed
  • 30,355 additions / 2,253 deletions
  • 100 commits
  • Three major feature areas: OAuth/OIDC authentication, Casbin-based RBAC authorization, and comprehensive observability stack

Overall Assessment

Status: ✅ APPROVE WITH RECOMMENDATIONS

This PR demonstrates high-quality engineering with well-architected systems. The authorization implementation is particularly impressive, and the observability stack is production-ready. However, there are critical security issues that should be addressed before production deployment.


🎯 Major Features Review

1. Authorization System (Casbin RBAC)

Strengths:

  • ✅ Excellent architecture using Casbin for scope-based access control
  • ✅ Clean separation of concerns with roles, scopes, and permissions
  • ✅ Deny-by-default security model
  • ✅ Comprehensive documentation (docs/content/authorization.md)
  • ✅ Proper constant-time comparison for bearer token validation (prevents timing attacks)
  • ✅ Flexible permission model: view, manage, shell, logs, create, destroy

Code Quality:

  • Well-structured modules (scotty/src/authorization/)
  • Good error handling with clear user-facing messages
  • Type-safe permission system using enums

2. OAuth/OIDC Authentication

Strengths:

  • ✅ PKCE implementation for enhanced security (SHA256)
  • ✅ Support for both device flow and web flow
  • ✅ Multi-mode authentication (dev, bearer, oauth)
  • ✅ Clean integration with frontend
  • ✅ Comprehensive documentation (docs/content/oauth-authentication.md)

⚠️ CRITICAL Security Issues:

  1. JWT Signature Validation Missing (scotty/src/oauth/device_flow.rs)

    • Currently only validates by calling userinfo endpoint
    • Does NOT verify JWT signature, expiration, audience, or issuer
    • Risk: Tokens could be forged or expired
    • Recommendation: Implement proper OIDC token validation using JWKS
    // NEEDS:
    // - JWT signature verification using provider's JWKS
    // - exp (expiration) claim validation
    // - aud (audience) claim validation
    // - iss (issuer) claim validation
  2. Token Validation Endpoint Doesn't Validate (scotty/src/rest/handlers/login.rs)

    pub async fn validate_token_handler() -> impl IntoResponse {
        // Simply returns success without actual validation!
        let json_response = serde_json::json!({"status": "success"});
        Json(json_response)
    }
    • Risk: Frontend cannot verify token validity
    • Recommendation: Implement actual token validation logic
  3. PKCE Verifier Storage (scotty/src/oauth/mod.rs)

    • Comment indicates PKCE verifier not stored securely
    • Currently in-memory HashMap without persistence
    • Risk: Lost on restart, potential race conditions
    • Recommendation: Use encrypted session storage with short TTL (15-30s)
  4. Session Storage Vulnerability (frontend/src/lib/sessionManager.ts)

    • Tokens stored in sessionStorage (vulnerable to XSS)
    • Risk: XSS attacks can steal tokens
    • Recommendation: Use httpOnly, Secure cookies with SameSite=Strict
  5. No Rate Limiting on Device Code Polling

    • Unlimited polling attempts on device flow
    • Risk: Brute force attacks on device codes
    • Recommendation: Implement exponential backoff and max retries

3. Observability Stack

Strengths:

  • Excellent implementation of OpenTelemetry with metrics and traces
  • ✅ Pre-configured Grafana dashboards for monitoring
  • ✅ Prometheus-compatible metrics (40+ metrics covering all subsystems)
  • ✅ Comprehensive documentation (docs/content/observability.md)
  • ✅ Clean architecture: Scotty → OTEL Collector → VictoriaMetrics/Jaeger → Grafana
  • ✅ Low resource usage (180-250MB total)
  • ✅ Flexible backend support (can swap to Prometheus, Thanos, etc.)

Metrics Coverage:

  • Log streaming (active streams, throughput, errors)
  • Shell sessions (active connections, timeouts)
  • WebSocket connections and message rates
  • Task execution and output streaming
  • HTTP server performance by endpoint
  • Memory usage (RSS and virtual)
  • Application fleet metrics
  • Tokio async runtime health

This is production-ready and well-documented.

4. Frontend Improvements

Strengths:

  • ✅ New unified output component for logs and task output
  • ✅ WebSocket store with automatic reconnection
  • ✅ Permission-based UI rendering
  • ✅ User avatar with Gravatar support
  • ✅ Clean session management abstraction
  • ✅ OAuth callback handling

Code Quality:

  • TypeScript types are well-defined
  • Good separation of concerns (stores, services, components)
  • Error handling implemented

⚠️ Security Concerns:

  • Token storage in sessionStorage (see OAuth section above)
  • Consider implementing Content Security Policy (CSP) headers

🔒 Security Analysis

Critical Issues (Must Fix Before Production)

  1. JWT Validation - Implement proper OIDC token signature and claims validation
  2. Token Storage - Migrate from sessionStorage to httpOnly cookies
  3. Token Validation Handler - Implement actual validation logic
  4. PKCE Storage - Secure encrypted storage for PKCE verifier

High Priority

  1. Rate Limiting - Add rate limiting for device code polling and login attempts
  2. CSRF Protection - Implement CSRF tokens for POST endpoints (admin operations)
  3. Token Refresh - Implement token rotation with refresh tokens

Medium Priority

  1. App Name Validation - Add whitelist validation (alphanumeric, dash, underscore)
  2. WebSocket Message Limits - Add size limits to prevent DoS
  3. Persistent Session Storage - Replace in-memory HashMap with Redis for production

Positive Security Findings ✅

  • Excellent secret handling: MaskedSecret with zero-on-drop is production-ready
  • Constant-time comparisons: Prevents timing attacks on bearer tokens
  • Comprehensive logging: Good audit trail for security events
  • Type safety: Rust prevents many vulnerability classes
  • No SQL injection risk: Uses typed APIs, no raw SQL
  • No command injection risk: Uses Bollard SDK properly

📊 Code Quality Assessment

Rust Backend

Strengths:

  • Clean module organization
  • Good error handling with context
  • Type-safe APIs throughout
  • Comprehensive test coverage appears present
  • Clear separation of concerns

Minor Issues:

  • Some TODO comments should be addressed (scotty/src/oauth/mod.rs)
  • Consider adding more inline documentation for complex authorization logic

Frontend (TypeScript/Svelte)

Strengths:

  • Good use of Svelte stores for state management
  • Clean component structure
  • Type definitions are comprehensive

Suggestions:

  • Consider adding unit tests for auth logic
  • Add JSDoc comments for complex functions
  • Implement stricter TypeScript config (strict: true)

🧪 Testing & Quality

Observations:

  • Test files not visible in PR (external to diff)
  • Should verify test coverage for:
    • Authorization permission checks
    • OAuth flow edge cases (expired codes, invalid state)
    • Bearer token validation
    • Scope membership and wildcards

Recommendations:

  • Add integration tests for complete OAuth flows
  • Add tests for authorization denial scenarios
  • Test concurrent WebSocket connections
  • Test PKCE challenge/verifier pairs

📝 Documentation

Strengths:

  • ✅ Excellent documentation for all major features
  • ✅ Clear examples in authorization.md and oauth-authentication.md
  • ✅ Comprehensive observability setup guide
  • ✅ Migration guides for existing installations
  • ✅ Security best practices documented

Suggestions:

  • Add security considerations section to each doc
  • Document the critical security issues noted above
  • Add troubleshooting section for common OAuth errors
  • Include example curl commands for testing authorization

🚀 Performance Considerations

Positive:

  • Casbin enforcer with caching for permission checks
  • Async/await throughout (non-blocking)
  • WebSocket for real-time updates (efficient)
  • VictoriaMetrics for low-resource metrics storage

Potential Concerns:

  • Bearer token lookup is O(n) - consider HashMap for O(1)
  • In-memory session storage doesn't scale horizontally
  • WebSocket store could accumulate memory without cleanup

Recommendations:

  • Add metrics for authorization check latency
  • Implement session cleanup scheduler
  • Consider connection limits for WebSockets

🔄 Breaking Changes

⚠️ Important for Existing Users:

  1. Bearer Token Authentication Changed

    • Old: api.access_token in config
    • New: Must add tokens to authorization assignments
    • Action Required: Migration documented in authorization.md
  2. Authorization Required

    • All apps assigned to scopes (default scope if not specified)
    • Users/tokens need explicit role assignments
    • Action Required: Create initial policy.yaml
  3. OAuth Mode is New

    • Separate from previous oauth2-proxy setup
    • Action Required: Update redirect URLs if migrating

These are well-documented in the migration sections.


📋 Recommendations Summary

Before Merging

  • Add TODO tracking for critical security issues
  • Update README with security warnings for OAuth
  • Add SECURITY.md documenting known issues

Before Production Deployment

  • Fix JWT validation (CRITICAL)
  • Fix token validation handler (CRITICAL)
  • Implement httpOnly cookie storage (HIGH)
  • Add CSRF protection (HIGH)
  • Secure PKCE storage (CRITICAL)

Next Release (0.2.1)

  • Add rate limiting
  • Implement token refresh
  • Add input validation
  • Switch to persistent session storage
  • Add integration tests

🎉 Highlights

What's Really Well Done:

  1. Secret Handling - The MaskedSecret implementation is excellent and shows security awareness
  2. Authorization Architecture - Casbin integration is clean and powerful
  3. Observability Stack - Production-ready and comprehensive
  4. Documentation - Some of the best documentation I've seen in a PR
  5. Code Organization - Clean module structure and separation of concerns

Final Recommendation

APPROVE with conditions:

This is high-quality work that significantly enhances Scotty's capabilities. The authorization system is well-architected, and the observability stack is production-ready. However, the OAuth/OIDC implementation has critical security gaps that must be addressed.

Recommendation:

  • ✅ Merge to main if this is a development branch
  • ⚠️ DO NOT deploy to production until critical security issues are resolved
  • 📝 Create issues for all security items listed above
  • 🔒 Consider a security audit before production OAuth deployment

Confidence Level: High - The foundations are solid, but security hardening is essential.


Detailed Issue Tracker

Would you like me to create separate GitHub issues for each security concern? I can draft them with:

  • Detailed descriptions
  • Code references
  • Proposed solutions
  • Priority labels

Great work on this release! The overall direction is excellent. 🚀


Review generated by Claude Code based on PR diff, documentation, and codebase analysis.

stmh added 3 commits October 27, 2025 18:17
Add automated session cleanup to prevent memory leaks and implement
full observability for OAuth authentication flows.

Session Cleanup (scotty-37):
- Add periodic cleanup task running every 5 minutes via clokwerk scheduler
- Create generic ExpirableSession trait for all session types
- Clean DeviceFlowStore, WebFlowStore, and OAuthSessionStore
- Prevent unbounded memory growth and DoS attacks
- Include comprehensive unit tests for cleanup functionality

Metrics Implementation (scotty-41):
- Add 10 new OAuth metrics to instruments.rs:
  * Session gauges: device_flow, web_flow, oauth sessions (sampled every 30s)
  * Flow counters: device flows, web flows, flow failures
  * Token validation: total, duration histogram, failures
  * Cleanup counter: expired sessions cleaned
- Create dedicated oauth/metrics.rs module with clean API
- Instrument device_flow.rs and handlers.rs with minimal code changes
- Record token validation timing and success/failure rates

Grafana Dashboard:
- Add "OAuth Authentication" section with 9 panels
- Session monitoring: 4 gauge panels for active sessions
- Flow rates: device/web flow rates with failure tracking
- Performance: token validation duration (p50/p95/p99)
- Error rates: validation and flow error percentages
- Color-coded thresholds and stacked time series

Benefits:
- Fixes memory leak vulnerability (sessions never expired)
- Prevents DoS attacks via session exhaustion
- Full observability for authentication health and performance
- Enables proactive monitoring and alerting
- Minimal impact on business logic (single-line instrumentation)

Relates to: scotty-37, scotty-41
…and-metrics

feat(oauth): implement session cleanup and comprehensive monitoring
@claude
Copy link

claude bot commented Oct 27, 2025

PR Review: Version 0.2.0

Overview

This is a substantial release adding three major features: OAuth authentication, RBAC authorization with Casbin, and observability with OpenTelemetry. The PR includes 31,475 additions across 100+ files.

Overall Assessment: The implementation is well-structured with good separation of concerns. However, the PR size makes thorough review challenging. Future releases should consider breaking features into smaller, reviewable chunks.


🎯 Positive Highlights

Architecture & Design

  • Excellent separation of concerns in the frontend authentication layer (authService.ts, sessionManager.ts, sessionStore.ts)
  • Strong type safety with comprehensive TypeScript types and Rust utoipa schemas
  • Well-documented authorization system with clear documentation in docs/content/authorization.md
  • Good error handling in OAuth flow with RFC 6749-compliant error codes

Security

  • Proper token validation with server-side validation endpoints
  • Secure session management using sessionStorage (appropriate for web apps)
  • OAuth implementation follows best practices with state validation
  • Authorization model using Casbin is a solid choice for RBAC

Code Quality

  • Consistent error handling with custom error types (OAuthError)
  • Good use of Rust features: proper derive macros, type safety with secrecy crate
  • Frontend store architecture is clean and maintainable

⚠️ Issues & Concerns

Critical Issues

1. Security: Hardcoded Credentials in Policy File

Location: config/casbin/policy.yaml:63-68

identifier:test-bearer-token-123:
  - role: admin
    scopes:
      - client-a
      - client-b
      - qa
      - default
  • ⚠️ Do not commit test bearer tokens to the repository, even if they're examples
  • This should be removed or moved to a .example file
  • Document how to properly configure bearer tokens without exposing them in version control

2. Potential Token Leakage in Console Logs

Location: frontend/src/stores/webSocketStore.ts:85-86

console.log(
    'Authenticating WebSocket with token (first 8 chars):',
    token.substring(0, 8) + '...'
);
  • Even partial token logging could aid attackers
  • Consider removing in production or using proper debug flags
  • Similar issue in multiple files with debug logging

3. Missing CSRF Protection

Location: frontend/src/lib/authService.ts:49-56

The login endpoint doesn't appear to use CSRF tokens. While credentials: 'include' is set, there's no CSRF token validation visible in the frontend code. Ensure the backend has CSRF protection for state-changing operations.

High Priority Issues

4. Inconsistent Error Handling in OAuth Callback

Location: frontend/src/lib/authService.ts:104-108

const errorData: OAuthErrorResponse = await response.json();
return {
    success: false,
    error: `Authentication failed: ${errorData.error_description || errorData.error}`
};
  • Error responses might not always be JSON (network errors, 5xx errors)
  • Need try-catch around JSON parsing
  • Should handle unexpected response formats gracefully

5. WebSocket Authentication Race Condition

Location: frontend/src/stores/webSocketStore.ts:242-254

The WebSocket connects and immediately calls authenticateWebSocket() in the onopen handler. However, there's a potential race condition:

  • authenticateWebSocket() calls get(isAuthenticated)
  • The session might not be fully initialized yet
  • Consider adding explicit initialization checks or promise-based readiness

6. Docker Image Security: Running as Root

Location: Dockerfile:78-79

# RUN chown -R $APP_USER:$APP_USER ${APP}
# USER $APP_USER
  • These lines are commented out, meaning the container runs as root
  • This is a security risk - containers should run as non-root users
  • Uncomment these lines or document why root access is required

Medium Priority Issues

7. Missing Input Validation

Location: scotty-core/src/admin/requests.rs

While the request structs are well-defined, there's no visible validation for:

  • Scope name format/length (line 9)
  • Role name format/length (line 20)
  • Permission string validation (line 25)
  • User ID format validation (line 34)

Consider adding validation:

#[cfg_attr(feature = "utoipa", schema(min_length = 1, max_length = 64, pattern = "^[a-z0-9-]+$"))]
pub name: String,

8. Gravatar URL Not Using HTTPS Exclusively

Location: frontend/src/lib/gravatar.ts:26

The Gravatar URLs use https:// which is good, but consider adding a comment about why HTTP is never used (security, mixed content).

9. OAuth Token Refresh Not Implemented

Location: scotty-core/src/auth/oauth_types.rs:29

/// Optional refresh token
pub refresh_token: Option<String>,

The refresh_token field exists but there's no visible implementation for token refresh. Users will be logged out when tokens expire. Consider implementing refresh token flow for better UX.

10. Session Storage Fallbacks

Location: frontend/src/lib/sessionManager.ts:26-28

private isAvailable(): boolean {
    return browser && typeof sessionStorage !== 'undefined';
}

When sessionStorage is unavailable, operations silently fail. Consider:

  • Warning users when sessionStorage is disabled
  • Providing a fallback mechanism or clear error messaging

11. Casbin Policy Missing Audit Trail

Location: config/casbin/policy.yaml

The policy file has created_at timestamps but no:

  • updated_at fields
  • created_by / updated_by fields
  • Version tracking

For production authorization systems, audit trails are critical for compliance and debugging.

Low Priority / Code Quality

12. TypeScript ts-expect-error Usage

Location: frontend/src/lib/authService.ts:139

// @ts-expect-error - Type-safe routing doesn't support dynamic redirect paths
await goto(resolve(redirectTo));

Consider fixing the type issue properly rather than suppressing it, or document why this is the only approach.

13. Duplicate Debug Logging

Multiple files have excessive console logging that should use a proper logging framework with levels:

  • frontend/src/stores/webSocketStore.ts (lines 77-96)
  • frontend/src/lib/authService.ts (line 75)

Consider using a logging library with configurable log levels.

14. Magic Numbers

Location: frontend/src/stores/webSocketStore.ts

const MAX_RECONNECT_ATTEMPTS = 5;
const RECONNECT_DELAY_BASE = 1000;
const PING_INTERVAL = 30000;

These are well-named constants, but consider making them configurable via environment variables for different deployment scenarios.

15. Missing Tests

No test files are visible in the PR. For a security-critical release adding authentication and authorization, comprehensive tests are essential:

  • Unit tests for auth flows
  • Integration tests for OAuth callback handling
  • Tests for authorization rules
  • WebSocket authentication tests

🔍 Performance Considerations

1. WebSocket Reconnection Exponential Backoff

Location: frontend/src/stores/webSocketStore.ts:220-222

function getReconnectDelay(attempts: number): number {
    return Math.min(RECONNECT_DELAY_BASE * Math.pow(2, attempts), 30000);
}

✅ Good implementation with capped maximum delay.

2. Session Validation on Every Auth Check

Location: frontend/src/lib/authService.ts:165-167

const isValid = await sessionStore.validateCurrentToken();

Making a network request for every auth check could be expensive. Consider:

  • Caching validation results with TTL
  • Using JWT with client-side expiry checking
  • Only validating on critical operations

3. Casbin Performance

The PR enables Casbin's cached feature (good!), but ensure:

  • Cache invalidation works correctly when policies change
  • Performance testing with large policy sets
  • Consider adding metrics for authorization check latency

📊 Test Coverage Gaps

Missing test coverage for:

  1. OAuth device flow (scotty-core/src/auth/oauth_types.rs:5-17)
  2. Session store state transitions
  3. WebSocket authentication failure scenarios
  4. Authorization policy enforcement
  5. Token expiration handling
  6. Concurrent session handling
  7. RBAC permission evaluation edge cases

🏗️ Architectural Concerns

1. PR Size

This PR is too large for effective review (31k+ additions). Consider for future:

  • Breaking into feature PRs: OAuth, Authorization, Observability
  • Each feature should be independently reviewable
  • Easier to bisect issues, revert if needed

2. Migration Path

For existing deployments upgrading to 0.2.0:

  • Is there a migration guide for adding authorization?
  • What happens to existing sessions on upgrade?
  • Are there breaking API changes?
  • Database/config migration required?

3. Observability Integration

The observability features (OpenTelemetry) are added but:

  • Are there metrics for auth failures?
  • Are authorization decisions logged?
  • What's the performance impact of tracing?

📝 Documentation

Well documented:

  • ✅ Authorization system (docs/content/authorization.md)
  • ✅ OAuth authentication (docs/content/oauth-authentication.md)
  • ✅ Observability setup (docs/content/observability.md)

Needs documentation:

  • ❌ Migration guide for existing deployments
  • ❌ Security best practices for production deployments
  • ❌ Bearer token generation and rotation
  • ❌ WebSocket authentication troubleshooting
  • ❌ OAuth provider configuration examples (beyond the example)

🎯 Recommendations

Before Merge

  1. MUST FIX: Remove identifier:test-bearer-token-123 from config/casbin/policy.yaml
  2. MUST FIX: Uncomment the USER directive in Dockerfile or document why root is needed
  3. MUST FIX: Add try-catch around JSON parsing in OAuth error handling
  4. SHOULD FIX: Reduce console.log of token information
  5. SHOULD ADD: Basic integration tests for auth flows

Post-Merge (Next Release)

  1. Implement token refresh mechanism
  2. Add comprehensive test suite
  3. Add authorization metrics and alerting
  4. Implement CSRF protection verification
  5. Add input validation for authorization resources
  6. Create migration guide
  7. Performance testing with large policy sets

Process Improvements

  1. Break down large features into reviewable PRs (<5000 lines)
  2. Add pre-merge checklist for security features
  3. Require tests for authentication/authorization code
  4. Security review process for auth-related changes

Summary

This is a significant and well-architected release that adds critical features. The code quality is generally high with good separation of concerns and type safety.

Key concerns:

  • Security issues with hardcoded test credentials and logging
  • Docker security (running as root)
  • Missing tests for security-critical features
  • PR size makes thorough review difficult

Recommendation: Address the critical security issues, add basic integration tests, and document the migration path before merging. The other issues can be addressed in follow-up PRs.

Approval: ⚠️ Approve with required changes - Fix critical security issues before merge.

Great work on this release! 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant